74 research outputs found
Fusion of Multispectral Data Through Illumination-aware Deep Neural Networks for Pedestrian Detection
Multispectral pedestrian detection has received extensive attention in recent
years as a promising solution for robust human target detection in
around-the-clock applications (e.g., security surveillance and autonomous
driving). In this paper, we demonstrate that illumination information encoded in
multispectral images can be utilized to significantly boost pedestrian
detection performance. A novel illumination-aware weighting mechanism is
presented to accurately characterize the illumination condition of a scene.
This illumination information is incorporated into two-stream deep
convolutional neural networks to learn multispectral human-related features
under different illumination conditions (daytime and nighttime). Moreover, we
utilize the illumination information together with multispectral data to
generate more accurate semantic segmentation, which is in turn used to boost
pedestrian detection accuracy. Putting all of the pieces together, we present a
powerful framework for multispectral pedestrian detection based on multi-task
learning of illumination-aware pedestrian detection and semantic segmentation.
Our proposed method is trained end-to-end using a well-designed multi-task loss
function and outperforms state-of-the-art approaches on the KAIST multispectral
pedestrian dataset.
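The illumination-aware weighting idea above can be sketched as a convex gating of the two detection streams. This is a minimal illustration, not the paper's architecture: the function names and the simple identity gating are assumptions for exposition only.

```python
# Hypothetical sketch of illumination-aware fusion; names and the gating
# rule are illustrative assumptions, not the paper's implementation.

def illumination_weight(day_prob: float) -> float:
    """Map the scene's estimated day-probability to a fusion weight.
    Under good illumination the visible stream dominates; at night
    the thermal stream takes over."""
    assert 0.0 <= day_prob <= 1.0
    return day_prob

def fuse_scores(score_visible: float, score_thermal: float,
                day_prob: float) -> float:
    """Convex combination of the two streams' pedestrian scores,
    gated by the illumination weight."""
    w = illumination_weight(day_prob)
    return w * score_visible + (1.0 - w) * score_thermal

# Daytime scene: trust the visible channel more.
day_score = fuse_scores(0.9, 0.4, day_prob=0.8)    # 0.8*0.9 + 0.2*0.4 = 0.80
# Nighttime scene: same raw scores, but the thermal channel dominates.
night_score = fuse_scores(0.9, 0.4, day_prob=0.1)  # 0.1*0.9 + 0.9*0.4 = 0.45
```

The same pair of stream scores thus yields a different fused decision depending on the estimated illumination condition, which is the core of the weighting mechanism described in the abstract.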
Box-level Segmentation Supervised Deep Neural Networks for Accurate and Real-time Multispectral Pedestrian Detection
Effective fusion of the complementary information captured by multi-modal
sensors (visible and infrared cameras) enables robust pedestrian detection
under various surveillance situations (e.g., daytime and nighttime). In this
paper, we present a novel box-level segmentation supervised learning framework
for accurate and real-time multispectral pedestrian detection that incorporates
features extracted from the visible and infrared channels. Specifically, our
method takes pairs of aligned visible and infrared images with easily obtained
bounding box annotations as input and estimates accurate prediction maps that
highlight the existence of pedestrians. It offers two major advantages over
existing anchor-box-based multispectral detection methods. First, it avoids the
hyperparameter-setting problem that arises during the training of
anchor-box-based detectors and obtains more accurate detection results,
especially for small and occluded pedestrian instances. Second, it is capable
of generating accurate detection results from small input images, improving
computational efficiency for real-time autonomous driving applications.
Experimental results on the KAIST multispectral dataset show that our proposed
method outperforms state-of-the-art approaches in terms of both accuracy and
speed.
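The box-level supervision described above hinges on turning cheap bounding-box annotations into per-pixel training targets. A minimal sketch of that rasterization step, assuming axis-aligned boxes in pixel coordinates (the function name and data layout are illustrative, not the paper's code):

```python
def boxes_to_mask(height: int, width: int, boxes):
    """Rasterize bounding boxes into a binary supervision mask:
    pixels inside any box are labelled 1 (pedestrian), others 0.
    Boxes are (x1, y1, x2, y2) with exclusive right/bottom edges."""
    mask = [[0] * width for _ in range(height)]
    for (x1, y1, x2, y2) in boxes:
        for y in range(max(0, y1), min(height, y2)):
            for x in range(max(0, x1), min(width, x2)):
                mask[y][x] = 1
    return mask

# One 2x2 box inside a 4x6 image marks exactly four pixels.
mask = boxes_to_mask(4, 6, [(1, 1, 3, 3)])
```

A network trained against such masks can then predict pedestrian-presence maps directly, sidestepping anchor-box hyperparameters entirely.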
Unsupervised Domain Adaptation for Multispectral Pedestrian Detection
Multimodal information (e.g., visible and thermal) enables robust pedestrian
detection for around-the-clock computer vision applications, such as
autonomous driving and video surveillance. However, it remains a crucial
challenge to train a reliable detector that works well across different
multispectral pedestrian datasets without manual annotations. In this paper,
we propose a novel unsupervised domain adaptation framework for multispectral
pedestrian detection that iteratively generates pseudo annotations and updates
the parameters of our designed multispectral pedestrian detector on the target
domain. Pseudo annotations are first generated using the detector trained on
the source domain, and then updated by fixing the detector's parameters and
minimizing the cross-entropy loss without back-propagation. Training labels
are derived from the pseudo annotations by exploiting the similarity and
complementarity between well-aligned visible and infrared image pairs. The
detector's parameters are then updated with these labels by minimizing our
defined multi-detection loss function with back-propagation. Optimal detector
parameters are obtained after iteratively updating the pseudo annotations and
parameters. Experimental results show that our proposed unsupervised
multimodal domain adaptation method achieves significantly higher detection
performance than the approach without domain adaptation, and is competitive
with supervised multispectral pedestrian detectors.
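The similarity/complementarity label-fusion step above can be sketched as follows: detections agreeing across the visible and infrared channels are kept (similarity), while single-channel detections survive only when highly confident (complementarity). The thresholds and function names here are illustrative assumptions, not the paper's exact rules.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0

def fuse_pseudo_labels(vis_dets, ir_dets, high_conf=0.9, iou_thresh=0.5):
    """Merge (box, confidence) detections from both channels into
    training labels: cross-channel matches are kept (similarity),
    unmatched boxes only if highly confident (complementarity)."""
    labels = []
    for box_v, conf_v in vis_dets:
        matched = any(iou(box_v, box_i) >= iou_thresh for box_i, _ in ir_dets)
        if matched or conf_v >= high_conf:
            labels.append(box_v)
    for box_i, conf_i in ir_dets:
        matched = any(iou(box_i, box_v) >= iou_thresh for box_v, _ in vis_dets)
        if not matched and conf_i >= high_conf:
            labels.append(box_i)
    return labels
```

Labels produced this way would then feed the back-propagation step of the iterative adaptation loop described in the abstract.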
Noise-Tolerant Unsupervised Adapter for Vision-Language Models
Recent advances in large-scale vision-language models have achieved impressive
performance across various zero-shot image classification tasks. While prior
studies have demonstrated significant improvements from introducing few-shot
labelled target samples, they still require labelling of target samples, which
greatly limits their scalability when handling various visual recognition
tasks. We design NtUA, a Noise-tolerant Unsupervised Adapter that learns
superior target models from few-shot unlabelled target samples. NtUA works as
a key-value cache that formulates the visual features and predicted
pseudo-labels of the few-shot unlabelled target samples as key-value pairs. It
consists of two complementary designs. The first is adaptive cache formation,
which combats pseudo-label noise by weighting the key-value pairs according to
their prediction confidence. The second is pseudo-label rectification, which
corrects both the pair values (i.e., pseudo-labels) and the cache weights by
leveraging knowledge distillation from large-scale vision-language models.
Extensive experiments show that NtUA achieves superior performance
consistently across multiple widely adopted benchmarks.
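The confidence-weighted key-value cache at the heart of NtUA can be sketched in a few lines. This is a toy stand-in assuming raw feature vectors and cosine similarity; the class and method names are illustrative, not NtUA's API.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

class ConfidenceCache:
    """Key-value cache: keys are visual features, values are
    pseudo-labels, and each pair carries a confidence weight so that
    noisy pseudo-labels contribute less to predictions."""
    def __init__(self):
        self.entries = []  # (feature, pseudo_label, confidence)

    def add(self, feature, pseudo_label, confidence):
        self.entries.append((feature, pseudo_label, confidence))

    def predict(self, feature, n_classes):
        """Class scores are confidence-weighted similarity votes
        over the cached pairs; return the argmax class."""
        scores = [0.0] * n_classes
        for key, label, conf in self.entries:
            scores[label] += conf * cosine(feature, key)
        return max(range(n_classes), key=scores.__getitem__)
```

Pseudo-label rectification would then amount to overwriting cached labels and confidences with distilled predictions from the large vision-language model, leaving the query path unchanged.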
MLAN: Multi-Level Adversarial Network for Domain Adaptive Semantic Segmentation
Recent progress in domain adaptive semantic segmentation demonstrates the
effectiveness of adversarial learning (AL) in unsupervised domain adaptation.
However, most adversarial-learning-based methods align source and target
distributions at a global image level but neglect inconsistency around local
image regions. This paper presents a novel multi-level adversarial network
(MLAN) that addresses inter-domain inconsistency at both the global image
level and the local region level. MLAN has two novel designs, namely,
region-level adversarial learning (RL-AL) and co-regularized adversarial
learning (CR-AL). Specifically, RL-AL explicitly models prototypical regional
context-relations in the feature space of a labelled source domain and
transfers them to an unlabelled target domain via adversarial learning. CR-AL
fuses region-level AL and image-level AL optimally via mutual regularization.
In addition, we design a multi-level consistency map that effectively guides
domain adaptation in both the input space (i.e., image-to-image translation)
and the output space (i.e., self-training). Extensive experiments show that
MLAN outperforms the state-of-the-art by a large margin consistently across
multiple datasets.
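One way to picture the co-regularization of region-level and image-level adversarial objectives is a combined loss with a penalty on their disagreement. This is a loose sketch under assumed scalar losses; the function name, the absolute-difference consistency term, and the weighting are all assumptions, not MLAN's actual formulation.

```python
def mlan_style_loss(region_losses, image_loss, lam=0.5):
    """Combine per-region adversarial losses with the image-level
    adversarial loss, adding a mutual-regularization term that
    penalizes disagreement between the two levels (assumed form)."""
    region_mean = sum(region_losses) / len(region_losses)
    consistency = abs(region_mean - image_loss)
    return region_mean + image_loss + lam * consistency
```

When the two levels agree, the consistency term vanishes and the objective reduces to the plain sum; disagreement is penalized in proportion to lam.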
Genetic analysis and population structure of wild and cultivated wishbone flower (Torenia fournieri Lind.) lines related to specific floral color
Background The wishbone flower, Torenia fournieri Lind., an annual from tropical Indochina and southern China, is a popular ornamental plant, and many interspecific (T. fournieri × T. concolor) hybrid lines have been bred for the international market. The cultivated lines show a pattern of genetic similarity that correlates with floral color, which informs future breeding strategies. This study aimed to analyze the genetics and population structure of the cultivated hybrid lines in comparison with closely related wild T. concolor populations. Methods We applied the retrotransposon-based iPBS marker system to genotype a total of 136 accessions from 17 lines/populations of Torenia. These included 15 cultivated lines from three series: Duchess (A, B, C); Kauai (D, E, F, G, H, I, J); Little Kiss (K, L, M, N, P); and two wild T. concolor populations (Q and R). PCR products from each individual were used to estimate genetic diversity and differentiation between lines/populations. Results Genotyping results showed a pattern of genetic variation differentiating the 17 lines/populations, characterized by their specific floral colors. The PCoA analysis, phylogenetic tree construction, and Bayesian population structure bar plot all showed a clear subdivision of the lines/populations analysed. The 15 cultivated hybrid lines and the wild population Q, which was collected from a small area, showed the lowest genetic variability, while the other wild population R, which was sampled from a larger area, had the highest genetic variability. Discussion The extremely low genetic variability of the 15 cultivated lines indicates that each line has undergone a similar reduction in diversity/heterozygosity from a bottleneck event, and each retained a similar (but mutually distinct) portion of the wild genetic diversity. The difference in genetic variance between the two wild T. concolor populations could be due to our differing sampling methods.
The two wild populations (Q, R) and the cultivated hybrid lines (I, K, M, N, P) are genetically more closely related, while strong positive correlations were present among cultivated lines A, C, E, M, and N. These results could be used to guide future Torenia breeding. Conclusions The genetic variation and population structure found in our study show that the cultivated hybrid lines underwent similar reductions in diversity/heterozygosity from a bottleneck event, with each line retaining a similar (but mutually distinct) portion of the wild genetic diversity, especially under strong phenotypic selection for specific floral colors. Generally, environmental factors can induce transposon activation and generate genetic variability, accelerating the evolutionary process of wild Torenia species. Our study revealed that wild Torenia populations sampled from a broad geographic region harbor outstanding genetic diversity, but selective breeding targeting a specific floral color decreases that variability.